Abstract. The ground-state energy $E_0$ of a spin glass is an example of an extreme statistic. We consider
the large deviations of this energy for a variety of models when the number of spins $N$ goes to infinity.
In most cases, the behavior can be understood qualitatively, in particular with the help of semi-analytical
results for hierarchical lattices. Particular attention is paid to the Sherrington-Kirkpatrick model; after
comparing to the Tracy-Widom distribution which follows from the spherical approximation, we find that
the large deviations give rise to non-trivial scaling laws with $N$.

1 Introduction

Systems with quenched disorder have been studied inten- 
sively for the last two decades. Thermodynamic proper- 
ties such as the free energy in these systems fluctuate 
from sample to sample, but not very much: indeed, they 
are self-averaging if the disorder does not have long range 
correlations [1] . This means that typical values of the free 
energy density (to name but one quantity) deviate arbi- 
trarily little from a fixed value in the large volume limit. 
Because of this, little work has considered large deviations, 
i.e., the probability of finding a rare sample (realization 
of the disorder) for which the free energy density deviates 
from the typical value by a significant amount. 

In spin glasses, the archetype of disordered systems 
with frustration, the large deviations of thermodynamic 
quantities have been studied so far in only 3 very special 
cases: (1) the Random Energy Model [2] (REM) which can 
be treated completely because of its extreme simplicity; 
(2) the Generalized REM [3] for which bounds have been 
obtained [4] on the probability of large deviations; (3) the 
Sherrington-Kirkpatrick model [5] but with results [6] in 
the paramagnetic phase only. This last model is important 
because, in contrast to the first two cases, it is based on 
a microscopic Hamiltonian with spins, a necessary step to 
have a realistic model. In the study that follows, we work 
well in the spin-glass phase for a variety of realistic models,
focusing on the statistics of the system's ground-state
energy $E_0$. In particular, we consider a nearly soluble case
associated with hierarchical lattices that gives some qual- 
itative insights into the properties of the large deviations 
function. 

The outline of this paper is as follows. We first introduce
some definitions and notation in Section 2, recalling some
general results on large deviations. In the rest of the paper,
we tackle successively different models of spin glasses.
Section 3 covers a class of hierarchical models for which both
analytics and numerics can be pushed very far. Section 4 deals
with the REM which is exactly solvable. These two models of
spin glasses display the importance of summing/minimizing over
random variables. In Section 5 the properties of $E_0$ in the
Sherrington-Kirkpatrick model are examined numerically; first
its distribution is compared to the Tracy-Widom one, and then
we investigate the scaling variables entering into different
large deviations functions. In Section 6 we consider both
Edwards-Anderson and mean field models of spin glasses for
which the connectivity is finite. Our numerical analysis shows
that their properties are close to those found in the
hierarchical lattices. Overall conclusions are given in
Section 7.



2 General results on large deviations 

Consider a physical observable $z_N$ of a system of $N$ spins
in the presence of quenched disorder; in our study $z_N$ will
be the ground-state energy density, $E_0/N$. Usually, $z_N$
satisfies two convergence properties. First, the distribution
of $z_N$ becomes peaked around $z^*$ at large $N$:

$$ \forall \varepsilon > 0, \quad P\{ |z_N - z^*| > \varepsilon \} \to 0 \quad \text{as } N \to \infty . \tag{1} $$



A familiar context where this arises is when averaging a
large number of independent identically distributed (i.i.d.
hereafter) random variables: the weak law of large numbers
says that such an average becomes peaked [7] when
$N \to \infty$. Second, the sequence $\{z_N\}$ itself converges
"almost always": if one successively adds spins, thereby
increasing $N$ and adding the corresponding coupling terms
to the Hamiltonian, typical sequences will converge to $z^*$.
In the context of averages of i.i.d. random variables, this is
called the strong law of large numbers. (Note that within
Eq. 1, the sequence $\{z_N\}$ can deviate arbitrarily from $z^*$
as long as such deviations arise more and more rarely as
$N \to \infty$.) In physics, the terminology "self-averaging"
usually refers to the weak convergence: if $p_N$ is the
probability density of $z_N$, then $p_N$ converges to the Dirac
distribution $\delta(z_N - z^*)$. It is of interest to understand the
nature of this convergence for physical observables; often,
it turns out to be "exponential". More precisely, one says
that the distribution of $z_N$ satisfies the Large Deviations
Principle (LDP in what follows) if for all $t$ one has the
following large $N$ asymptotics:

$$ p_N(z_N = t) \approx K(t,N)\, e^{-N f(t)} \tag{2} $$



where $K$ is a slowly varying function compared to the
exponential. The function $f$ is called the Large Deviations
Function (LDF hereafter). In our numerical study, we consider

$$ f_N(t) = -\frac{1}{N} \ln\left[ p_N(z_N = t) \right] \tag{3} $$

and then we estimate $f$ via $f_N \to f$ as $N \to \infty$. More
traditionally, the LDF is defined [8] from the distribution
function of $z_N$ which is generally a better quantity to
consider mathematically. Introduce the two probabilities
$P\{z_N > t\}$ for $t > z^*$ and $P\{z_N < t\}$ for $t < z^*$; from
these one defines the LDF $f$ from the limits (if they exist):

$$ f(t) = \lim_{N\to\infty} -\frac{1}{N} \ln P\{z_N > t\}, \quad t > z^* ; \qquad
   f(t) = \lim_{N\to\infty} -\frac{1}{N} \ln P\{z_N < t\}, \quad t < z^* . \tag{4} $$



In the next paragraphs we consider simple examples of the 
LDP. 
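
As a concrete check of Eq. 3 (an illustration added here, not part of the numerics reported in this paper), one may take $z_N$ to be the mean of $N$ fair 0/1 coin flips. In that assumed toy case $p_N$ is the binomial law and the limit $f(t) = \ln 2 + t \ln t + (1-t) \ln(1-t)$ is known in closed form, so the convergence $f_N \to f$ can be followed directly. The function names below are hypothetical.

```python
import math

def f_N(N: int, k: int) -> float:
    """Finite-N estimate of Eq. 3, -(1/N) ln P{z_N = k/N},
    for z_N the mean of N fair 0/1 coin flips (binomial law)."""
    log_p = (math.lgamma(N + 1) - math.lgamma(k + 1)
             - math.lgamma(N - k + 1) - N * math.log(2))
    return -log_p / N

def f_limit(t: float) -> float:
    """Closed-form large-N limit for 0 < t < 1 in this toy case."""
    return math.log(2) + t * math.log(t) + (1 - t) * math.log(1 - t)

if __name__ == "__main__":
    t = 0.2
    for N in (20, 200, 2000, 20000):
        k = N // 5  # so that k / N = 0.2
        print(N, f_N(N, k), f_limit(t))
```

The difference $f_N - f$ decays like $\ln N / N$, which is the kind of slowly varying prefactor $K(t,N)$ allowed by Eq. 2.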



2.1 Sums of independent variables 

Consider first the case of averages of i.i.d. random variables.
Let $\{x_i\}$ be i.i.d. variables of probability density $\mu$,
and

$$ z_N = \frac{1}{N} \sum_{i=1}^{N} x_i . \tag{5} $$

For any $\lambda$, define

$$ \mathcal{L}(\lambda) = \ln \int \mu(x)\, e^{\lambda x}\, dx \tag{6} $$

and

$$ f(t) = \max_{\lambda} \left( \lambda t - \mathcal{L}(\lambda) \right) . \tag{7} $$

Cramér's theorem (see for instance ref. [8]) states that for
any closed set $F$ and any open set $G$

$$ \limsup_{N\to\infty} \frac{1}{N} \ln P\{z_N \in F\} \le - \inf_{t \in F} f(t) , \qquad
   \liminf_{N\to\infty} \frac{1}{N} \ln P\{z_N \in G\} \ge - \inf_{t \in G} f(t) , \tag{8} $$

where $\limsup$ and $\liminf$ are the sup and inf limits.

Still sharper asymptotics are provided by the Bahadur-Rao
theorem if the first two moments of the $x_i$ are finite.
Without loss of generality, assume that the $x_i$ have a zero
mean and a unit variance. Let $\lambda(t)$ be the value of $\lambda$ such
that the sup in Cramér's theorem is reached, i.e.,
$f(t) = \lambda(t)\, t - \mathcal{L}(\lambda(t))$. Then for all $t > 0$, we have

$$ \lim_{N\to\infty} P\{z_N > t\}\, N^{1/2}\, e^{N f(t)} = \frac{1}{\lambda(t) \sqrt{2\pi\, \mathcal{L}''[\lambda(t)]}} \tag{9} $$

as well as the analogous relation for $t < 0$.

Let us now apply this framework to one-dimensional
spin glasses. Suppose one has a chain of $N$ Ising spins
($S_i = \pm 1$) with free boundary conditions described by the
Hamiltonian

$$ H = - \sum_{i=1}^{N-1} J_{i,i+1} S_i S_{i+1} \tag{10} $$

where the $J_{i,i+1}$ are i.i.d. random variables with probability
density $\rho$. It is easy to see that the ground-state energy
density is

$$ e_0 = - \frac{1}{N} \sum_{i=1}^{N-1} |J_{i,i+1}| \tag{11} $$

leading one to the identification $x_i = -|J_{i,i+1}|$ and then
$e_0 = (1 - 1/N)\, z_{N-1}$. The direct application of Cramér's
theorem gives the LDF of the ground-state energy density:

$$ f(t) = \max_{\lambda} \left[ \lambda t - \ln \int \rho(J)\, e^{-\lambda |J|}\, dJ \right] . \tag{12} $$

This last result shows explicitly the non-universality of
the large deviations function. To illustrate this, we can
compute $f$ for several specific cases. Consider first the
discrete distribution

$$ \rho(J) = \frac{1}{2}\, \delta(J) + \frac{1}{4}\, \delta(|J| - 1) , \tag{13} $$

for which $|J|$ takes the values 0 and 1 with equal probability.
This model's ground-state energy density is never positive
and its mean is $-1/2$. A simple computation shows
that when $t < 0$ one has $\exp(\lambda) = (1 + t)/(-t)$ while
$\lambda = \infty$ for $t > 0$. This leads to

$$ f(t) = \ln 2 + |t| \ln |t| + (1 + t) \ln (1 + t) \tag{14} $$

for $-1 \le t \le 0$ and $f(t) = \infty$ otherwise. Note that $f$
is symmetric about $-1/2$ and that $f(-1/2) = 0$ as it
should. It is easy to see that any discrete $J$ distribution
leads to a bounded range of possible values of $e_0$; clearly
one has $f(t) = \infty$ outside of this range, but interestingly
$f(t)$ is finite at the boundary values. Finally, for the case
considered in Eq. 13, we have $f(0) = f(-1) = \ln 2$ but
the slope of $f$ is infinite at those points.
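
The Legendre transform of Eq. 7 is easy to evaluate numerically, which gives an independent check of Eq. 14. The sketch below is an added illustration (the function names are ours, and the ternary search is just one convenient maximiser): it takes $x = -|J|$ with $|J| \in \{0, 1\}$ equally likely, as in Eq. 13, and compares $\max_\lambda [\lambda t - \mathcal{L}(\lambda)]$ with the closed form.

```python
import math

def log_mgf(lam: float) -> float:
    """L(lambda) = ln E[exp(lambda * x)] for x = -|J|, |J| = 0 or 1 equally likely."""
    return math.log(0.5 * (1.0 + math.exp(-lam)))

def legendre(t: float, lo: float = -50.0, hi: float = 50.0, iters: int = 200) -> float:
    """max over lambda of lambda*t - L(lambda), by ternary search.
    The objective is concave in lambda, so the search converges to the maximum."""
    def g(lam: float) -> float:
        return lam * t - log_mgf(lam)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            lo = m1
        else:
            hi = m2
    return g(0.5 * (lo + hi))

def f_eq14(t: float) -> float:
    """Closed form of Eq. 14, valid for -1 < t < 0."""
    return math.log(2) + abs(t) * math.log(abs(t)) + (1 + t) * math.log(1 + t)

if __name__ == "__main__":
    for t in (-0.9, -0.5, -0.1):
        print(t, legendre(t), f_eq14(t))
```

The same `legendre` routine applies to any coupling distribution once `log_mgf` is replaced, which is how the non-universality of $f$ shows up in practice.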

Consider now an exponential distribution:

$$ \rho(J) = \frac{\alpha}{2}\, e^{-\alpha |J|} . \tag{15} $$

The mean (typical) value of $e_0$ is $-1/\alpha$, and again $e_0 < 0$
but now all negative values of $e_0$ can arise. A simple
calculation leads to

$$ f(t) = -\alpha t - 1 - \ln(\alpha |t|) \quad \text{for } t < 0 \tag{16} $$

and $f(t) = \infty$ for $t > 0$. We now have a logarithmic
divergence as $t \to 0^-$ while the behavior when $t \to -\infty$ follows
that of $-\ln \rho(-|J| = t)$, i.e., $f(t) \approx -\alpha t$. This is a general
feature: when $t \to -\infty$ in any spin-glass model, we must
have that all the $J$s become large; furthermore there is
no more frustration, that is we reach the ferromagnetic
limit. Let's consider this explicitly in the case where $\rho(J)$
is a Gaussian of zero mean and unit variance. A simple
computation shows that $f(t) \approx t^2/2 - \ln 2$ at large negative
$t$ which has the same divergence as $-\ln \rho(-|J| = t)$;
this pattern will hold in all spin-glass models. The general
picture is then that $f(t) = \infty$ for $t > 0$ while the
$t \to -\infty$ limit simply reflects the asymptotics of $\rho(J)$.
This last property shows explicitly why large deviations
in spin glasses are not universal. Note that this is in sharp
contrast to what happens for the limiting shape of the
distribution of $E_0$. In the case we are considering, $E_0$ is the
sum of independent variables; the central limit theorem
then shows that the distribution's shape becomes Gaussian
at large $N$.


2.2 Minimum of independent variables

Another quantity of interest in spin glasses is the domain
wall energy $\Delta$. This energy is simply the change in the
ground-state energy when applying anti-periodic rather
than periodic boundary conditions. Using the notation of
the previous paragraphs, it is clear that for a chain of
spins

$$ \Delta_N = \mathrm{sign}\Big( \prod_i J_{i,i+1} \Big)\, \min_i \{ |J_{i,i+1}| \} . \tag{17} $$

The distribution of $\Delta$ is easily obtained in terms of $\rho$,
especially if $\rho(J)$ is even in $J$. Limiting ourselves to that
case, introduce an integrated distribution function $q_N(t)$
of $|\Delta_N|$:

$$ q_N(t) = 2 \int_{|t|}^{\infty} p_N(\Delta_N)\, d\Delta_N = \left[ 2 \int_{|t|}^{\infty} \rho(J)\, dJ \right]^{N-1} \quad \text{for } t > 0 . \tag{18} $$

The LDF $f(t)$ is of course even and can be obtained
directly from Eqs. 2 or 3, leading to

$$ f(t) = - \ln \left[ 2 \int_{|t|}^{\infty} \rho(J)\, dJ \right] . \tag{19} $$

Here again one sees that the LDF is sensitive to $\rho(J)$ and
thus is not universal. When we consider instead the large
$N$ shape of the distribution of $|\Delta_N|$, we see that it follows
a Weibull [9] distribution if $\rho(J)$ is smooth; such a limit
law is universal.


2.3 Case of general spin-glass models

For an arbitrary Ising spin glass, the Hamiltonian is

$$ H(\{S_i\}) = - \sum_{i<j} J_{ij} S_i S_j \tag{20} $$

where the $J_{ij}$ are i.i.d. random variables. The ground-state
energy $E_0$ is the minimum value of $H$ when considering
all the $2^N$ assignments of the spin values; the main
difference with the domain wall energy case just treated is
that these $2^N$ random values are correlated. It has been
proven for at least some classes of spin glasses [1] that
$E_0/N$ converges in probability when $N \to \infty$ ($E_0$ is
self-averaging). Our interest in this work is to find out
empirically whether this convergence is exponential or not, i.e.,
whether there is a LDP. It is difficult to motivate a LDP
from the point of view of minimization over $2^N$ highly
correlated assignments; instead it is better to formulate the
problem differently as follows. Rewrite Eq. 20 as a sum of
$N$ terms:

$$ H(\{S_i\}) = \sum_{i=1}^{N} \Big( - \sum_{j>i} J_{ij} S_i S_j \Big) . \tag{21} $$

The ground-state energy $E_0$ is then the sum of these $N$
terms when the $S_i$ are set to their ground-state values.
These different terms are correlated as is clear from the
fact that a given $S_i$ appears in several such terms, and in
contrast to the one dimensional chain, there is no way to
change variables so that the terms become independent.
Nevertheless, correlations are relatively weak if $H$ is local
(that is if the interactions are short range), and probably
weak correlations do not spoil the existence of a LDP. In
fact, clues to this effect come from extensions of Cramér's
theorem, leading us to try to test for the presence of a LDP
in our systems. We now do this in a "hands-on" fashion,
using numerical analysis on different spin-glass models.
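
For very small systems the definition of $E_0$ as a minimum over all $2^N$ spin assignments can be applied literally. The sketch below is an added illustration (exhaustive enumeration is not the optimisation method used later in this paper, and the function name is ours): it minimises the Hamiltonian of Eq. 20 by brute force and checks, on a random open chain, the one-dimensional result $E_0 = -\sum_i |J_{i,i+1}|$ of Eq. 11.

```python
import itertools
import random

def ground_state_energy(n_spins, couplings):
    """Exhaustive minimisation of H = -sum_{(i,j)} J_ij S_i S_j.
    `couplings` maps a pair (i, j) to J_ij; cost grows as 2**n_spins."""
    best = float("inf")
    for spins in itertools.product((-1, 1), repeat=n_spins):
        energy = -sum(J * spins[i] * spins[j] for (i, j), J in couplings.items())
        best = min(best, energy)
    return best

if __name__ == "__main__":
    rng = random.Random(0)
    N = 10
    # open chain with Gaussian couplings: no frustration, every bond can be satisfied
    chain = {(i, i + 1): rng.gauss(0.0, 1.0) for i in range(N - 1)}
    print(ground_state_energy(N, chain), -sum(abs(J) for J in chain.values()))
    # a frustrated triangle: one bond is necessarily unsatisfied
    triangle = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): -1.0}
    print(ground_state_energy(3, triangle))
```

The triangle shows in miniature why the general case differs from the chain: with frustration, the terms of Eq. 21 cannot all be minimised independently.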



3 Migdal-Kadanoff lattices

We begin with the family of models following from the
Migdal-Kadanoff (MK) approach [10,11] where one performs
a bond-moving real-space renormalisation group.
This procedure effectively amounts to computing quantities
on hierarchical (MK) lattices defined recursively. The
recursion takes one bond (that is an edge of the current
graph) into $b$ paths, each made of $l$ segments. The first
iteration is represented in Fig. 1 for $b = 4$, $l = 2$. In what
follows we restrict ourselves to the case $l = 2$. Given such
a "lattice", Ising spins are placed on its sites. The system's
Hamiltonian is then

$$ H_J = - \sum_{\langle ij \rangle} J_{ij} S_i S_j \tag{22} $$

where $S_i = \pm 1$ and $\langle ij \rangle$ restricts the sum to nearest
neighbors. The $J_{ij}$ are quenched i.i.d. random variables assigned
to the edges. In general, they are either Gaussian or
bimodal ($J_{ij} = \pm 1$).

Fig. 1. Construction of a Migdal-Kadanoff lattice having $b = 4$
branches and $l = 2$ segments.

Compared to the systems we will consider further on,
MK lattices have the advantage of allowing an exact
recursion for the ground-state energy. Thus both analytic
and numerical computations of the LDF can be performed
without too much difficulty, even though $E_0$ remains a
sum of dependent random variables.



3.1 Definitions

If $g$ is the recursion level (beginning with $g = 0$ as shown
in Fig. 1), then the "linear" lattice size (or diameter) $L_g$
is $2^g$ and the volume $N_g$ (actually the number of edges
or terms contributing to the energy) is $(2b)^g$. The space
dimension $d$ is obtained via the identity $N_g = L_g^d$ so one
has $d = 1 + \ln b / \ln 2$. The usual choice for $d = 3$ is
$b = 4$, $l = 2$, while $b = l = 2$ gives $d = 2$.

In the recursion relation for the ground-state energy,
one needs to keep track of two energies: $E^{(p)}$ and $E^{(a)}$.
These give the MK lattice ground-state energy at the $g$th
generation when the two exterior spins are respectively
parallel and antiparallel. The ground-state energy $E_0$ then
reads:

$$ E_0 = \min\{ E^{(p)}, E^{(a)} \} . \tag{23} $$

The recursion for a $b$-branch MK lattice with $l = 2$ at the
$(g+1)$th stage is:

$$ E^{(p)}_{g+1} = \sum_{k=1}^{b} \min\{ E^{(p)}_g(1,k) + E^{(p)}_g(2,k),\; E^{(a)}_g(1,k) + E^{(a)}_g(2,k) \} \tag{24} $$

$$ E^{(a)}_{g+1} = \sum_{k=1}^{b} \min\{ E^{(p)}_g(1,k) + E^{(a)}_g(2,k),\; E^{(a)}_g(1,k) + E^{(p)}_g(2,k) \} \tag{25} $$

where the index 1,2 refers to the two bonds forming the
$k$th branch. These equations say that for each of the $b$
branches, one has to choose the orientation of the middle
spin such that the energy is minimized given the orientations
of the external spins. Note that the terms summed
are independent as they live on different branches.

The analysis of these recursions requires the study of
the joint distribution of $E^{(p)}$ and $E^{(a)}$ which is a
complicated problem. It simplifies a bit if one takes
$\Delta = E^{(a)} - E^{(p)}$ and $\Sigma = E^{(p)} + E^{(a)}$ as the new variables:
this gives an autonomous equation for $\Delta$:

$$ \Delta_{g+1} = \sum_{k=1}^{b} \mathrm{sign}[\Delta_g(1,k)\, \Delta_g(2,k)]\, \min\{ |\Delta_g(1,k)|, |\Delta_g(2,k)| \} \tag{26} $$

$$ \Sigma_{g+1} = \sum_{k=1}^{b} \left[ \Sigma_g(1,k) + \Sigma_g(2,k) - \max\{ |\Delta_g(1,k)|, |\Delta_g(2,k)| \} \right] . \tag{27} $$

For our work, we consider distributions of the $J_{ij}$ that
are symmetric about 0; that of $\Delta$ is then also symmetric
about 0.


3.2 Domain wall energies

The $\Delta$ introduced in Eq. 26 gives the domain wall energy
for the given sample. The meaning of this quantity is best
understood through the ferromagnetic case presented in
Fig. 2: it is the smallest energy (in absolute value) of
domain walls separating the sample into left and right. The
domain wall energy is a central quantity in the theory of
spin glasses: if the typical value of $\Delta$ diverges with $L$, one
has a true spin-glass phase [12,13].

Interestingly, $\Delta$ is not self averaging, in contrast to $E_0$.
Nevertheless, one may still consider the large deviations
of this quantity. First, notice that the number of terms
that contribute to $\Delta$ is $O(L^{d-1})$ in MK lattices; the
ferromagnetic case thus leads to $\Delta = O(L^{d-1})$. This suggests
we consider the scaling $\Delta_g \sim L_g^{d-1}$ when $g \to \infty$. We thus
define $x_g$ as $\Delta_g / L_g^{d-1}$ and hope for a LDP of the type

$$ p(x_g = t) \sim e^{-N_g f(t)} . \tag{28} $$
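
The recursions of Eqs. 24-27 are simple to iterate on a computer. The sketch below is an added illustration under stated assumptions (bimodal couplings, a plain recursive sampler, function names of our choosing; it is not the simulation code behind the figures): it builds one random MK sample in the $(E^{(p)}, E^{(a)})$ variables and, with the same random stream, in the $(\Delta, \Sigma)$ variables, so that the change of variables and the relation $E_0 = (\Sigma - |\Delta|)/2$ can be checked on the same lattice.

```python
import random

def mk_energies(g, b, rng):
    """(E_p, E_a) for one random generation-g MK lattice with l = 2 and J = +/-1."""
    if g == 0:
        J = rng.choice((-1.0, 1.0))
        return -J, J  # single bond: exterior spins parallel / antiparallel
    E_p = E_a = 0.0
    for _ in range(b):
        p1, a1 = mk_energies(g - 1, b, rng)  # first bond of the branch
        p2, a2 = mk_energies(g - 1, b, rng)  # second bond of the branch
        E_p += min(p1 + p2, a1 + a2)         # Eq. 24
        E_a += min(p1 + a2, a1 + p2)         # Eq. 25
    return E_p, E_a

def mk_delta_sigma(g, b, rng):
    """(Delta, Sigma) for one random generation-g MK lattice, via Eqs. 26-27."""
    if g == 0:
        J = rng.choice((-1.0, 1.0))
        return 2.0 * J, 0.0  # Delta = E_a - E_p, Sigma = E_a + E_p
    delta = sigma = 0.0
    for _ in range(b):
        d1, s1 = mk_delta_sigma(g - 1, b, rng)
        d2, s2 = mk_delta_sigma(g - 1, b, rng)
        prod = d1 * d2
        sign = 1.0 if prod > 0 else (-1.0 if prod < 0 else 0.0)
        delta += sign * min(abs(d1), abs(d2))    # Eq. 26
        sigma += s1 + s2 - max(abs(d1), abs(d2)) # Eq. 27
    return delta, sigma

if __name__ == "__main__":
    g, b = 3, 4
    E_p, E_a = mk_energies(g, b, random.Random(1))
    delta, sigma = mk_delta_sigma(g, b, random.Random(1))  # same couplings, same order
    print(E_p, E_a, delta, sigma, min(E_p, E_a), (sigma - abs(delta)) / 2)
```

Both functions draw the couplings in the same depth-first order, which is why seeding them identically reproduces the same sample.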



A. Andreanov et al.: Large deviations in spin-glass ground-state energies 



[Figure 2: two panels, labelled "Periodic B.C." and "Anti-P.B.C."]

Fig. 2. Domain-wall for the ferromagnetic case: the shaded
region represents the spins that have flipped.



Let's rewrite the autonomous Eq. 26 for $x_g$:

$$ x_{g+1} = \frac{1}{b} \sum_{k=1}^{b} \mathrm{sign}[x_g(1,k)\, x_g(2,k)]\, \min\{ |x_g(1,k)|, |x_g(2,k)| \} . \tag{29} $$

Unlike the case in one dimension, it is impossible to apply
the technique of distribution functions because of the sum
over $k$ terms; similarly, a Fourier transformation is useless
because of the minimization. Nevertheless, some progress
can be made by noting that the right side of Eq. 29 is
a partial mean of $x_g(a,k)$, $a = 1,2$. Then one can show
that if $x_g$ satisfies the LDP with LDF $f(t)$, then so does
$x_{g+1}$: once a LDP is formed, it is conserved. Empirically,
the LDP turns out to be exactly as expected in Eq. 28.
Thus, we believe a limit $f(t) = -\lim N_g^{-1} \ln p_g(x_g = t)$
as $N_g \to \infty$ exists. Clearly, the LDF isn't universal; it
depends on the initial distribution $p_0$ of $x_0$.

To obtain an analytical estimate of $f(t)$, we notice that
as $b \to \infty$ the LDP should be formed after just one
generation. An approximation $f_a(t)$ for the LDF $f(t)$ is then

$$ f_a(t) = \frac{1}{2} \max_{\lambda} \left\{ \lambda t - \ln \left[ 4 \int_{-\infty}^{\infty} ds\, e^{\lambda s} p_0(s) \int_{|s|}^{\infty} dv\, p_0(v) \right] \right\} \tag{30} $$

(written here for a symmetric density $p_0$), where the argument
of the logarithm is the characteristic function of
$\mathrm{sign}(x_0(1) x_0(2)) \min\{|x_0(1)|, |x_0(2)|\}$. This
formula arises from the direct application of Cramér's
theorem for Eq. 29 with $g = 0$, regarding the quantities
$\mathrm{sign}(x_0(1) x_0(2)) \min\{|x_0(1)|, |x_0(2)|\}$ as the i.i.d. variables
(recall that they live on different branches).

How good is this approximation, and in particular does
it become exact at large $b$? Note that the LDF (30) is
$b$-independent because we took $b \to \infty$ and we expect in
fact $f_a(t)$ to be the limiting value of $f(t)$ as $b \to \infty$. A
series of numerical simulations were carried out in order to
quantify the large $b$ approximation. We used the bimodal
distribution $J_{ij} = \pm 1$:

$$ \rho(J) = \frac{1}{2} \left( \delta(J - 1) + \delta(J + 1) \right) . \tag{31} $$



[Figure 3: curves labelled "b=4: simulations", "b=10: simulations" and "theory".]

Fig. 3. Domain-wall LDF for the Migdal-Kadanoff lattices
when $J_{ij} = \pm 1$. Displayed are data for $b = 4$, $b = 10$, and the
theoretical $b \to \infty$ prediction.



The evaluation of Eq. 30 for $|t| < 2$ then gives:

$$ f_a(t) = -\frac{1}{2} \ln 2 + \frac{1}{2} \ln \sqrt{4 - t^2} - \frac{t}{4} \ln \sqrt{\frac{2 - t}{2 + t}} \tag{32} $$

which is even as it must; one also has $f_a(t) = +\infty$ for
$|t| > 2$ which is good since it also holds necessarily for the
exact $f(t)$.
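
Equation 32 can be cross-checked numerically. For $J_{ij} = \pm 1$ a single bond has $x_0 = \pm 2$, so $\mathrm{sign}(x_0(1) x_0(2)) \min\{|x_0(1)|, |x_0(2)|\}$ takes the values $\pm 2$ with equal probability and its generating function is $\cosh(2\lambda)$. The sketch below is an added illustration (the function names and the ternary-search maximiser are our choices): it evaluates half of the Cramér transform of that variable and compares it with the closed form.

```python
import math

def f_a_closed(t: float) -> float:
    """Closed form of Eq. 32, valid for |t| < 2."""
    return (-0.5 * math.log(2)
            + 0.5 * math.log(math.sqrt(4.0 - t * t))
            - 0.25 * t * math.log(math.sqrt((2.0 - t) / (2.0 + t))))

def f_a_numeric(t: float, lo: float = -40.0, hi: float = 40.0, iters: int = 200) -> float:
    """(1/2) max_lambda [lambda*t - ln cosh(2*lambda)], by ternary search
    on the concave objective (branch variable = +/-2 with probability 1/2 each)."""
    def g(lam: float) -> float:
        return lam * t - math.log(math.cosh(2.0 * lam))
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * g(0.5 * (lo + hi))

if __name__ == "__main__":
    for t in (-1.5, 0.0, 0.7, 1.9):
        print(t, f_a_closed(t), f_a_numeric(t))
```

The factor $1/2$ reflects that the first generation contains $N_1 = 2b$ bonds but only $b$ independent branch variables.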

The simulations revealed a rapid convergence with
generations of the large deviations functions to their limits:
$g = 5$ was enough to nearly reach machine precision even
when $b = 4$. Furthermore, as illustrated in Fig. 3, the
discrepancy $\delta f(t)$ between the theoretical prediction $f_a(t)$
and the numerically determined LDF decreased with
increasing $b$ (a few percent for $b = 10$ to fractions of a
percent for $b = 50$). The numerical analysis of $\delta f$ suggests
that the corrections decay as a power law in $b$. The behaviour of
$\delta f(t)$ rescaled by this power of $b$ is shown in Fig. 4 for
several values of $b$. Naturally, it would be of interest to
understand the origin of this exponent. It is also interesting
to note that the range of $t$ is bounded, $-2 < t < 2$, but that
the derivative of $f$ at these limit points is finite, in contrast
to what happened in the one-dimensional model.



3.3 Ground-state energies 



The study of $E_0$ is more complicated as it has no
autonomous recursion equation: we must follow the joint
distribution of $\Sigma$ and $\Delta$. Recall that

$$ E_0 = (\Sigma - |\Delta|)/2 . \tag{33} $$

Since $\Delta_g$ scales as $L_g^{d-1}$ whereas $\Sigma_g$ scales as $L_g^d$, the $\Delta$
term in Eq. 33 can be neglected in the study of large
deviations. Thus, the large deviations of $\Sigma$ and of $E_0$ coincide,
allowing us to focus hereafter on $\Sigma$.

We define the intensive variable $\xi_g = \Sigma_g / N_g$ and rewrite
Eq. 27 as:

$$ \xi_{g+1} = \frac{1}{b} \sum_{k=1}^{b} \left[ \frac{1}{2} \left( \xi_g(1,k) + \xi_g(2,k) \right)
   - \frac{1}{2^{g+1}} \max\{ |x_g(1,k)|, |x_g(2,k)| \} \right] . \tag{34} $$

A LDP is expected as $N_g \to \infty$:

$$ P(\xi_g = t) \sim \exp(-N_g f(t)) . \tag{35} $$

Note that $\overline{\xi}$, the disorder average of $\xi$, has a non-zero
value; it can be written in terms of the statistics of
the $x$ via iteration of Eq. 34:

$$ \overline{\xi}_{g+1} = - \sum_{p=0}^{g} \frac{1}{2^{p+1}}\, \overline{\max\{ |x_p(1)|, |x_p(2)| \}} \tag{36} $$

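where the index $p$ labels generations. This identity leads
us to guess that the statistics of $\xi$ is determined entirely
by that of $x$. Indeed, it is possible to write down explicitly
$\xi_{g+1}$ in terms of the $x_p$, $0 \le p \le g$:

$$ \xi_{g+1} = - \sum_{p=0}^{g} \frac{1}{2^{p+1}}\, \frac{1}{b\,(2b)^{g-p}} \sum_{m=1}^{b\,(2b)^{g-p}} \max\{ |x_p(1,m)|, |x_p(2,m)| \} \tag{37} $$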

where the different $x$'s have been regrouped according to
their generation number. Evidently the terms within the
sum over $m$ are independent, but the terms for different
generations are correlated: indeed, each $x_{p+1}$ is a sum over
a sub-set of the $x_p$'s. Thus the problem hasn't really
become easier: instead of the joint distribution of $\xi$ and $x$,
we are obliged to consider the joint distribution of the $x_p$
at all $p \le g$. On the other hand it is reasonable to expect
weak cross-generation correlations for $b \to \infty$. This
suggests the approximation of independent generations: take



all the terms in Eq. 37 to be uncorrelated! Such an
approximation leads to the estimate $\hat{f}(t)$ of the LDF:

$$ \hat{f}(t) = \max_{\lambda} \left[ \lambda t - \sum_{p \ge 0} \frac{1}{2\,(2b)^{p}} \ln M^{\max}_p\!\left( -\lambda b^{p} \right) \right] . \tag{38} $$

In this equation, $M^{\max}_p$ represents the characteristic
function for $\max\{|x_p(1)|, |x_p(2)|\}$. To actually compute $\hat{f}$, we
started with the numerically determined distribution of
the $x_p$ and performed the sum in the series as far as
possible and then estimated the remaining part from
asymptotics.

Numerical simulations were carried out to determine
the distributions of $\Sigma$ and of the $x_p$. As before, we used
the bimodal $J_{ij} = \pm 1$ distribution of couplings. Although
this time it was much harder to compute the distribution
of $\Sigma$ when $g$ grew, again a good convergence in $g$ was
found in 5 to 6 generations; this is illustrated in Fig. 5.
Note that since we are dealing with the $J_{ij} = \pm 1$ model,
we clearly have $-2 < \xi < 0$ at large $g$, but in fact the
upper limit is not reached for finite $b$. Also, when $g$ is
finite, the ranges for $\xi$ are slightly different. Finally, as $g$
grows, it is difficult to follow the far tail of the distribution
of $\Sigma$ because the probabilities there become extremely
small. These different effects are responsible for the
cut-offs in the curves we show. As another general remark,
consider the ferromagnetic limit. The probability of having
no frustration in the MK lattice is bounded from below by
$2^{-N_g}$; thus necessarily $f(-2)$ is finite whereas $f(t) = \infty$
as soon as $t < -2$.

How does the estimate of $f$ (obtained from the data
at the largest $g$ we can handle) compare to $\hat{f}$, the
prediction of the independent generations approximation? The
results for $b = 4$ and 10 are shown in Figs. 6 and 7. The
discrepancy between "theory" and numerics is larger than
it was for $\Delta$ but the error does decrease slowly as $b$ grows,
giving evidence that the independent generations
approximation becomes exact as $b \to \infty$.

Fig. 6. Ground-state energy density LDF when $b = 4$ in the
$J_{ij} = \pm 1$ MK model: comparison of the independent
generations approximation ("theory") and simulations.

Fig. 7. Ground-state energy density LDF when $b = 10$ in the
$J_{ij} = \pm 1$ MK model: comparison of the independent
generations approximation ("theory") and simulations.



4 The Random Energy Model 

The simplest model of spin glasses is the Random Energy
Model (REM) [2]. In a system with $N$ spins there are $2^N$
possible assignments of the spins, leading to $2^N$ possible
energy levels. The framework of the REM is to consider
that these energies are independent. The usual choice is
for each energy level to have the probability density

$$ \rho_N(u) = \frac{1}{\sqrt{\pi N}} \exp(-u^2/N) \tag{39} $$

whose integrated distribution is

$$ q_N(u) = \int_{u}^{\infty} \frac{dv}{\sqrt{\pi N}} \exp(-v^2/N) . \tag{40} $$

The ground-state energy $E_0$ for this system is given by:

$$ E_0 = \min\{ x_1, \ldots, x_{2^N} \} . \tag{41} $$

As such, $E_0$ follows a Gumbel [9] distribution. However,
we are mainly interested in large deviations that are
associated with events where $E_0$ is far from its typical value.
Noting that the $N$ dependence of $\rho_N$ is chosen so that
$E_0$ scales linearly with $N$ in the large $N$ limit, we thus
consider $e_0 = E_0/N$:

$$ e_0 = \frac{1}{N} \min\{ x_1, \ldots, x_{2^N} \} . \tag{42} $$

Since the $x_i$ are independent, the integrated distribution
of $e_0$ is:

$$ P\{e_0 > t\} = P\{x_i > tN \ \ \forall i\} = q_N^{2^N}(tN) \tag{43} $$

giving

$$ p(e_0 = t) = N\, 2^N\, q_N^{2^N - 1}(tN)\, \rho_N(tN) . \tag{44} $$

The asymptotic behavior of these quantities is easily
evaluated. For $t < -\sqrt{\ln 2}$, we have

$$ p(e_0 = t) \sim e^{-N (t^2 - \ln 2)} \tag{45} $$

in agreement with the fact that the typical value of $e_0$ is
$-\sqrt{\ln 2}$. When $-\sqrt{\ln 2} < t < 0$, the nature of the scaling
is different:

$$ \ln p(e_0 = t) \sim - e^{-N (t^2 - \ln 2)} . \tag{46} $$

The scaling at $t < -\sqrt{\ln 2}$ can be understood by
considering the probability that just one of the $2^N$ energies is
anomalously low. On the contrary, the case $t > -\sqrt{\ln 2}$
follows by imposing that all $2^N$ energies are a bit high.
We learn from this example that the "normal" scaling of
Eq. 2 does not always arise: one may have one type of
LDP for $t < e_0^*$ and another for $t > e_0^*$.
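
Both REM regimes can be checked numerically from the exact finite-$N$ expression, provided one works with logarithms. The sketch below is an added illustration (the function name is ours): it evaluates $\ln p(e_0 = t)$ from Eq. 44 using $q_N(u) = \tfrac{1}{2}\,\mathrm{erfc}(u/\sqrt{N})$, which follows from Eq. 40, and compares the result with Eqs. 45 and 46 on either side of $-\sqrt{\ln 2}$.

```python
import math

def log_p_e0(t: float, N: int) -> float:
    """ln p(e0 = t) from Eq. 44, for t < 0, evaluated in log space.
    Uses q_N(u) = erfc(u / sqrt(N)) / 2, i.e. 1 - erfc(|u| / sqrt(N)) / 2 for u < 0."""
    if t >= 0:
        raise ValueError("this sketch only covers t < 0")
    u = t * N
    log_rho = -u * u / N - 0.5 * math.log(math.pi * N)          # ln rho_N(tN), Eq. 39
    log_q = math.log1p(-0.5 * math.erfc(-u / math.sqrt(N)))    # ln q_N(tN), Eq. 40
    return math.log(N) + N * math.log(2) + (2.0 ** N - 1.0) * log_q + log_rho

if __name__ == "__main__":
    N = 400
    # below the typical value: -(1/N) ln p should approach t^2 - ln 2 (Eq. 45)
    t = -1.0
    print(-log_p_e0(t, N) / N, t * t - math.log(2))
    # above the typical value: (1/N) ln(-ln p) should approach ln 2 - t^2 (Eq. 46)
    t = -0.5
    print(math.log(-log_p_e0(t, N)) / N, math.log(2) - t * t)
```

In the second regime $\ln p$ itself is exponentially large in $N$, which is why one takes a further logarithm before dividing by $N$.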



5 The Sherrington-Kirkpatrick model 

The Sherrington-Kirkpatrick [5] model (hereafter SK) is
the mean field limit of spin glasses where all $N$ spins are
connected to one another. The Hamiltonian is

$$ H_J = - \sum_{i<j} J_{ij} S_i S_j ; \tag{47} $$

the $S_i = \pm 1$ are Ising spins and the couplings $J_{ij}$ are
Gaussian random variables of zero mean and variance $1/N$.
(This scaling ensures a good thermodynamic limit, the free
energy scaling linearly with $N$.) We focus on the statistics
of the ground-state energy $E_0$ of this Hamiltonian.
For a given sample, $E_0$ can be determined by combinatorial
optimization techniques; we have used a genetic
algorithm [14] which finds $E_0$ with a high level of reliability
when $N$ is not too large. This is true for the SK model
and also for the other models we shall consider further on.
In all these cases, we restrict ourselves to $N$ values where
the true $E_0$ is almost certainly obtained. We then compute
the statistics of $E_0$ for a number of different system
sizes $N$ and then attempt to extrapolate to the large $N$
limit.

5.1 Limiting distribution

Before examining the large deviations of $E_0$ in the SK
model, let's first discuss its limiting distribution. The mean
of $E_0$ scales linearly with $N$, and its variance grows with $N$,
but one may expect the distribution of $E_0$ to have a
limiting shape. To study this, one introduces the scaling
variable

$$ X_J = \frac{E_0 - \overline{E_0}}{\sigma(E_0)} \tag{48} $$

where $\overline{E_0}$ is the disorder average of $E_0$ and $\sigma(E_0)$ its
standard deviation. Bouchaud et al. [15] studied the
distribution of $X_J$, finding that, for most models of spin
glasses, it becomes Gaussian as $N \to \infty$. However in the
case of the SK model, the behavior was clearly non-Gaussian.
Since $E_0$ is an extreme statistic, being the minimum of
$2^N$ energies, it is natural to ask whether the distribution
of $E_0$ falls into a known universality class. There
are three standard universality classes for the minimum
of uncorrelated variables [9]: (1) Gumbel for unbounded
distributions whose tail decreases faster than any power;
(2) Fisher-Tippett-Fréchet for distributions whose tail
decreases as a power; (3) Weibull for distributions with
cut-offs. The REM clearly falls into the first class.
Interestingly, the SK model does not fall into any of these three
classes [15,16], leaving open the question of the universality
class appropriate for its $E_0$.

Recently a new universality class has been uncovered
by Tracy and Widom [17]: in the context of random matrix
theory, they found the limiting distribution of the largest
eigenvalue of an $N \times N$ random matrix when $N \to \infty$.
This distribution is believed to be universal, and has
already been applied to a number of different systems. Note
that we are dealing now with the maximum of a large number
of correlated random variables; furthermore, because
of the minus sign in Eq. 47, we have to consider in fact
$(-1)$ times this largest eigenvalue when comparing to $E_0$
(note that this changes the sign of the skewness). We refer
to the distribution of this quantity within the Gaussian
Orthogonal Ensemble as the Tracy-Widom (TW) distribution.

How does the distribution of $E_0$ compare with that of
TW? We find that the agreement is surprisingly good. In
Fig. 8 we have plotted the TW distribution as a continuous
curve and our SK data when $N = 50, 100, 150$. In the
inset we compare the two distributions directly; the naked
eye sees no difference between the two. In the main part
of the figure, we zoom on the tails, displaying the data on
a logarithmic scale; then definite deviations appear on the
left wing. At a more quantitative level, we have also
compared the values of skewness $S$ and kurtosis $K$ of the TW
and SK distributions. Our estimations are summarized in
Table 1. We find a "large" disagreement; clearly the
discrepancy seen in the tail of Fig. 8 affects these cumulants,
in fact all the more that they are of higher order. Since
at least the SK skewness is numerically stable and quite
different from the TW skewness, we feel that our data
indicate that the two distributions, though very close, are in
fact distinct. Thus TW does not give us the universality
class for SK.

Fig. 8. Comparison of the Tracy-Widom distribution to that of
the Sherrington-Kirkpatrick model at $N = 50, 100, 150$. Data
shown are on a logarithmic scale (and on a linear scale for the
inset); $X_J$ is defined in Eq. 48.

Table 1. Skewness (S) and Kurtosis (K) for the TW distribution
and for that of $E_0$ in the SK model (numerical estimates
are for $N = 50$, 100, and 150).

        TW        SK50      SK100     SK150
S      -0.293    -0.43     -0.42     -0.41
K       0.165     0.41      0.36      0.39
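
For readers who wish to repeat the comparison on their own samples of $E_0$, the quantities entering Fig. 8 and Table 1 are straightforward to compute. The sketch below is an added illustration (function name ours); it assumes, as the TW value 0.165 in Table 1 suggests, that "kurtosis" means excess kurtosis, i.e. the fourth standardised moment minus 3.

```python
import math

def standardized_moments(samples):
    """Return (X, skewness, excess kurtosis) for a list of ground-state energies.
    X is the list of scaled values X_J = (E0 - mean) / std, as in Eq. 48."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((e - mean) ** 2 for e in samples) / n
    std = math.sqrt(var)
    x = [(e - mean) / std for e in samples]
    skew = sum(v ** 3 for v in x) / n
    excess_kurt = sum(v ** 4 for v in x) / n - 3.0
    return x, skew, excess_kurt

if __name__ == "__main__":
    # toy data with a long left tail, standing in for a sample of E0 values
    toy = [-3.0, 0.0, 0.0, 1.0, 1.0, 1.0]
    x, skew, kurt = standardized_moments(toy)
    print(skew, kurt)
```

A negative skewness, as in Table 1, corresponds to a left wing (low energies) heavier than the right one.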

On the theoretical side, what possible justification could
be given for $E_0$ to be described by TW since $E_0$ is a
minimum over $2^N$ variables whereas in the TW matrix problem
one considers the extremum of $N$ variables? One possible
answer resides in the spherical approximation to the
SK model. Kosterlitz et al. [18] solved the SK model in
that approximation and showed that in the limit $T \to 0$,
the Boltzmann weight becomes dominated by the largest
eigenvalue of the interaction matrix $J_{ij}$; this then leads
to a TW distribution for $E_0$ in that model. Given this
result, it would be good to understand why the spherical
approximation seems to be so good for the shape of the
distribution of $E_0$ yet gives a very poor approximation
for the ground-state energy density ($-1.0$ to be compared
with the actual value of $-0.7633$).

To push further the point that we do not believe the two distributions to be identical, consider the following fact. For the maximum eigenvalue of a random symmetric matrix, the result of TW shows that the mean shifts with $N$ and the standard deviation grows with $N$. Translating into the variable $E_0$, the correspondence with random matrix theory would predict that



$$\overline{E_0} - N e_0^* \sim N^{1/3} \tag{49}$$

and

$$\overline{E_0^2} - \overline{E_0}^2 \sim N^{2/3} \tag{50}$$
where $e_0^*$ is the thermodynamic limit of $E_0/N$. The first equation is in agreement with what happens in the SK model [19,15], but the standard deviation does not at all grow as $N^{1/3}$; it grows much more slowly [15,16].
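
As a consistency check on the exponents in Eqs. 49 and 50, here is a short sketch of how they follow from the spherical approximation. It assumes a normalization that is not restated in this section: couplings $J_{ij}$ of variance $1/N$, the spherical constraint $\sum_i S_i^2 = N$, and a Hamiltonian $H = -\sum_{i<j} J_{ij} S_i S_j$. The symbol $\chi$ is introduced here for a TW-distributed variable of order one.

```latex
\begin{align}
E_0 &= -\frac{N}{2}\,\lambda_{\max}, \qquad
       \lambda_{\max} \simeq 2 + N^{-2/3}\,\chi \\
E_0 &\simeq -N - \frac{1}{2}\,N^{1/3}\,\chi \\
\overline{E_0} - N e_0^* &\simeq -\frac{1}{2}\,N^{1/3}\,\overline{\chi} \sim N^{1/3},
       \qquad e_0^* = -1 \\
\overline{E_0^2} - \overline{E_0}^2 &\simeq \frac{1}{4}\,N^{2/3}\,\mathrm{Var}(\chi) \sim N^{2/3}
\end{align}
```

Under these assumptions the standard deviation of $E_0$ scales as $N^{1/3}$, which is the growth law that the SK data do not follow.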



5.2 Large deviations 

Still staying within the SK model, we now turn to the large deviations of $e_0 = E_0/N$. The central question is whether there is an LDP. A naive expectation would be that



$$P_N(e_0) \sim e^{-N f(e_0)} \tag{51}$$



but our data do not follow such a scaling. By considering the dependence of $\ln P_N(e_0)$ on $N$, our data suggest two quite distinct power laws in $N$: $N^{1.5}$ when $e_0 > e_0^*$ and $N^{1.2}$ when $e_0 < e_0^*$. We thus define two LDF:

$$f_>(e_0) = -\frac{\ln[P_N(e_0)/N^{0.75}]}{N^{1.5}} \tag{52}$$

in the first case and

$$f_<(e_0) = -\frac{\ln[P_N(e_0)/N^{0.6}]}{N^{1.2}} \tag{53}$$



in the second. The additional rescaling in the log has no effect on the large $N$ scaling, but we found it to reduce finite size effects; note that the motivation for this term comes from the Bahadur-Rao theorem given in Section 2. We plot in Fig. 9 these two functions modulo a shift of the x axis.
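
The power laws in $N$ are read off from how $\ln P_N(e_0)$ grows with $N$ at fixed $e_0$. The sketch below shows one simple way to estimate such an exponent, assuming a pure power law $-\ln P_N = a N^{\alpha}$ and ignoring the prefactor inside the logarithm. The function name and the synthetic input are illustrative only; they are not the paper's data or procedure.

```python
import math

def fit_power_law_exponent(sizes, neg_log_probs):
    """Least-squares fit of ln(-ln P_N) = ln(a) + alpha * ln(N).

    Returns (alpha, a). All inputs must be positive.
    """
    if len(sizes) != len(neg_log_probs) or len(sizes) < 2:
        raise ValueError("need two lists of equal length >= 2")
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in neg_log_probs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    amplitude = math.exp(mean_y - alpha * mean_x)
    return alpha, amplitude

# Synthetic check: data generated with exponent 1.5 should give alpha = 1.5.
sizes = [30, 50, 100, 200, 300]
synthetic = [0.02 * n ** 1.5 for n in sizes]
alpha, amplitude = fit_power_law_exponent(sizes, synthetic)
```

On real data the fitted value is only an effective exponent over the accessible range of $N$, so it should be checked for stability when the smallest sizes are dropped.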

It is not surprising that the exponent found for $f_<$ is less than or equal to 1.5. Indeed, consider for instance a sample with $e_0 < e_0^*$. Now change the sign of $O(N^{1.5})$ of the $J_{ij}$ that are unsatisfied; this will "cost" a probability $O(\exp[-a N^{1.5}])$. With this change, we have $e_0 \to e_0 - \delta$ where $\delta$ is finite, i.e., the desired result. The fact that the exponent turns out to be smaller than 1.5 probably comes from the fact that there are many ways to choose these $O(N^{1.5})$ $J_{ij}$ whereas the argument uses only one such choice.
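
The counting behind this argument can be made explicit as follows. This sketch assumes the usual SK normalization, in which each coupling has typical magnitude $N^{-1/2}$; $E_0'$ is notation introduced here for the ground-state energy after the sign changes. Keeping the original ground-state spin configuration fixed, each flipped unsatisfied bond lowers the energy of that configuration by $2|J_{ij}|$, and the new ground state can only lie lower still.

```latex
\begin{align}
E_0' &\le E_0 - 2 \sum_{\text{flipped } (ij)} |J_{ij}|, \qquad
       \sum_{\text{flipped } (ij)} |J_{ij}| \sim N^{1.5} \times N^{-1/2} = N \\
\delta &= \frac{E_0 - E_0'}{N} = O(1)
\end{align}
```

So changing $O(N^{1.5})$ couplings is enough to shift the energy density by a finite amount, which is why 1.5 acts as an upper bound on the exponent.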

Because we are restricted to $N$ not very large, we cannot exclude the possibility that we are seeing effective exponents and that a different scaling arises at larger $N$. (In contrast to the MK case, the large deviations become much more difficult to measure when $N$ grows and our algorithm for determining $E_0$ also breaks down at large $N$.) Nevertheless, since our scalings work well, the data strongly suggest that there really are two different exponents for $e_0 < e_0^*$ and $e_0 > e_0^*$, just as in the REM.



5.3 Very large deviations 

Fig. 9. Large Deviations Functions $f_<$ and $f_>$ for the Sherrington-Kirkpatrick model when N = 30, 50, ..., 300.

Fig. 10. Very Large Deviations Function for the Sherrington-Kirkpatrick model: data for 40 < N < 100.

In the one-dimensional example of Section 2.1 and in the MK model, the ferromagnetic limit arose when $e_0 \ll e_0^*$. In the SK model, the absence of frustration requires constraining $O(N^2)$ $J_{ij}$ and changes the scaling of $e_0$, which becomes $O(N^{1/2})$. Clearly such deviations are far rarer than simply having $e_0 \ll e_0^*$. Because of this, we are limited in our numerical study to even smaller $N$ values than before. Nevertheless, we have investigated these deviations by considering the probability distribution of $e_0/N^{1/2}$. We find



$$P_N(e_0/N^{1/2}) \sim N\, e^{-N^2 f(e_0/N^{1/2})} \tag{54}$$



with a very large deviations function (VLDF) $f$ displayed in Fig. 10. (To reduce finite $N$ effects, we have in fact used $e_0 - \overline{e_0}$ rather than $e_0$ itself, but this does not change the asymptotics.) We see that for positive arguments, the estimates for $f$ increase with $N$, which is compatible with the expectation that $f(t) = \infty$ when $t > 0$. For negative arguments, the data collapse quite well, justifying the VLDP proposed in Eq. 54.
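
The two scales entering Eq. 54 can be illustrated on the simplest unfrustrated samples, those in which every coupling is positive. The estimate below is only a sketch under assumptions not restated in this section: Gaussian $J_{ij}$ of zero mean and variance $1/N$, and a Hamiltonian $H = -\sum_{i<j} J_{ij} S_i S_j$, so that the ground state of such a sample has all spins aligned.

```latex
\begin{align}
\Pr(\text{all } J_{ij} > 0) &= 2^{-N(N-1)/2}
       = \exp\!\left[-\frac{\ln 2}{2}\,N(N-1)\right] \\
E_0 &= -\sum_{i<j} J_{ij}
       \simeq -\frac{N(N-1)}{2}\sqrt{\frac{2}{\pi N}}
       \simeq -\frac{N^{3/2}}{\sqrt{2\pi}} \\
\frac{e_0}{N^{1/2}} &\simeq -\frac{1}{\sqrt{2\pi}}
\end{align}
```

The probability of such samples decays as the exponential of $N^2$ while $e_0$ grows as $N^{1/2}$, in line with the form of the scaling in Eq. 54.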



6 Other spin-glass models

Most probably, the subtleties of the SK model arise from the fact that all spins are connected to one another. In more realistic models, spins interact with just a few neighbors. We thus return to models of the type described by the Hamiltonian in Eq. 22, and now require the number of neighbors to be fixed and constant. The two standard classes of models that do that are the Edwards-Anderson [20] model on square or cubic lattices and the mean field fixed connectivity models [21]. At present, no analytical tools are available for treating large deviations, so we resort to a purely numerical analysis. In what follows, the disorder variables $J_{ij}$ of Eq. 22 are i.i.d. Gaussian random variables of zero mean and unit variance.

Fig. 11. Large Deviations Function in the Edwards-Anderson model with Gaussian couplings in d = 2. The solid line is the limiting curve for $N \to \infty$. Inset: the d = 3 case.

Fig. 12. Large Deviations Function for fixed connectivity random graphs when z = 10. Inset: the case z = 3.



6.1 Euclidean models 

We first look at the LDP of $e_0$ for the Edwards-Anderson model with Gaussian couplings in two and three dimensions. In the case of finite dimensional lattices, we know that the variance of $E_0$ grows linearly with $N$, and most probably the distribution of $E_0$ is Gaussian [22]. This suggests that the terms contributing to the ground-state energy are nearly independent, just as in the MK case. Then the functions

$$f_N(e_0) = -\frac{\ln[P_N(e_0)/N^{1/2}]}{N} \tag{55}$$

should converge as $N \to \infty$. We first checked this on our 3 < L < 10 data in the two-dimensional (d = 2) case. Under the hypothesis that the finite size corrections are $O(1/N)$, we extract the limiting function $f_\infty$ for $N \to \infty$. This function is shown in Fig. 11 as a continuous curve. The LDF is asymmetric, increasing more rapidly on the right than on the left. We also show our data for the $f_N$, showing that the scaling in Eq. 55 works well. The d = 3 case is displayed as an inset; note that the LDF is more symmetric than for d = 2. Also, the convergence of $f_N$ to $f_\infty$ again goes as $O(1/N)$.
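
The rescaling of Eq. 55 and the extrapolation under an assumed $O(1/N)$ correction can be sketched as follows. The function names are hypothetical and the two-size extrapolation is one simple choice, not necessarily the procedure used for Fig. 11. The check uses the exactly known density of the mean of $N$ unit Gaussians, for which $f_\infty(e) = e^2/2$.

```python
import math

def ldf_estimate(density, n, scale_exp=1.0):
    """Finite-N large deviations function.

    Computes -ln[P_N / N^(scale_exp / 2)] / N^scale_exp, which is Eq. 55
    when scale_exp = 1.
    """
    if density <= 0:
        raise ValueError("density must be positive")
    log_ratio = math.log(density) - 0.5 * scale_exp * math.log(n)
    return -log_ratio / n ** scale_exp

def extrapolate_ldf(f_small, n_small, f_large, n_large):
    """Eliminate a c/N correction from two sizes, assuming f_N = f_inf + c/N."""
    return (n_large * f_large - n_small * f_small) / (n_large - n_small)

def gaussian_mean_density(e, n):
    """Exact density at e of the mean of n independent unit Gaussians."""
    return math.sqrt(n / (2.0 * math.pi)) * math.exp(-n * e * e / 2.0)

e = 0.3
f50 = ldf_estimate(gaussian_mean_density(e, 50), 50)
f100 = ldf_estimate(gaussian_mean_density(e, 100), 100)
f_inf = extrapolate_ldf(f50, 50, f100, 100)
```

In this test case the finite-size correction is exactly $\ln(2\pi)/(2N)$, so the two-size extrapolation recovers $e^2/2$. With measured histograms the correction is only approximately $c/N$ and several sizes should be compared.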



6.2 Fixed connectivity random graphs 

We now focus on random graphs that are of fixed con- 
nectivity z at each site. Such graphs can be generated by 
the following simple algorithm. First construct N vertices, 
then successively add edges to the graph by randomly con- 
necting sites whose current connectivity is less than z. 
This process can be made more efficient by keeping a list 
of all the candidate sites. The size of this list decreases; 
if it reaches 1, the construction has to be abandoned and 
restarted. In practice such a failure does not occur very 
often. 
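
The construction just described can be sketched in a few lines. This is a minimal illustration rather than the authors' code. Two details are assumptions added here: an edge is never duplicated, and the attempt is also abandoned when every remaining pair of candidate sites is already linked. The function name and the `max_restarts` parameter are hypothetical.

```python
import random

def fixed_connectivity_graph(n, z, rng=None, max_restarts=1000):
    """Return adjacency sets of a random graph where every site has degree z."""
    if z >= n or (n * z) % 2 != 0:
        raise ValueError("need z < n and n * z even")
    rng = rng or random.Random()
    for _ in range(max_restarts):
        adjacency = [set() for _ in range(n)]
        # Candidate list: sites whose current connectivity is below z.
        candidates = list(range(n)) if z > 0 else []
        failed = False
        while candidates:
            if len(candidates) == 1:
                failed = True  # a lone site cannot be completed: restart
                break
            i, j = rng.sample(candidates, 2)
            if j in adjacency[i]:
                # Assumption: also restart if no unlinked candidate pair is left.
                if all(b in adjacency[a]
                       for a in candidates for b in candidates if a != b):
                    failed = True
                    break
                continue
            adjacency[i].add(j)
            adjacency[j].add(i)
            candidates = [v for v in candidates if len(adjacency[v]) < z]
        if not failed:
            return adjacency
    raise RuntimeError("no graph found; increase max_restarts")

graph = fixed_connectivity_graph(12, 3, random.Random(0))
```

Rebuilding the candidate list after each edge costs $O(N)$ per step; this keeps the sketch short, and a swap-and-pop removal would be the natural optimization for large $N$.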

Just as in the Euclidean case, we find the simple scaling

$$P_N(e_0) \sim \sqrt{N}\, e^{-N f(e_0)} . \tag{56}$$

Thus, as in the MK case, we are closer to a sum of i.i.d. random variables than to a minimum of independent variables. In Fig. 12 we display the LDF for the case z = 10; the inset is for z = 3. We see that the convergence to the large $N$ limit is quite good, leading to an envelope curve. It should be clear that the two functions displayed are distinct: the case z = 3 is a bit more symmetric than the case z = 10.



7 Conclusions 

We have investigated the large deviations of the ground-state energy $E_0$ in several Ising spin-glass models. There are two extreme cases: (1) $E_0$ is the sum of independent random variables (as in a one dimensional lattice); (2) $E_0$ is the minimum of a large number of independent random variables (as in the Random Energy Model). The case of the Sherrington-Kirkpatrick model led to a behavior intermediate between these two extremes. However, the other spin-glass models we considered, all of which have finite connectivity, closely resemble the first case. This was particularly clear for the hierarchical lattice models where high quality numerical computations were possible as well as analytical estimates.
We also pointed out that the large deviations function is not universal; in particular, its tails depend very much on the tails of the distribution of the disorder variables $J_{ij}$. This leads one to ask whether there is any universality in large deviations functions. This question is all the more interesting since Brunet and Derrida [23] found the large deviations function for a directed polymer in a random medium to be universal; furthermore, that function is the same as the one describing infinitesimal deviations in their system. To have this kind of universality in spin glasses, the deviations must not much affect the values taken by the $J_{ij}$; for us, this seems possible only in the Sherrington-Kirkpatrick model. More precisely, we believe that the regime $E_0 = O(N)$ in that model is universal (independent of the details of the $J_{ij}$); this is to be contrasted with the ferromagnetic limit where clearly the large deviations function is not universal.

Many questions surrounding these issues are still very 
open; let us list a few. (0) What are the analytical proper- 
ties of the large deviation function, and when is it related 
to the distribution of infinitesimal deviations? (1) When 
the large deviations function is finite, we found a quite 
smooth convergence of our finite N estimates to a limit- 
ing curve; is this convergence uniform? (2) What is the 
distribution of the disorder variables $J_{ij}$ when $e_0$ deviates
from its typical value? (3) In the ferromagnetic limit, do 
the regions with frustration phase separate out? (We be- 
lieve not.) Clearly, numerical analysis is not a good tool to 
tackle most of these questions. Fortunately, there is good 
reason to believe that Migdal-Kadanoff lattices provide a 
good framework to address these questions; we hope that 
analytical results will be derived in the near future for 
such spin-glass models. 

Acknowledgements - We thank A. Pagnani for his help throughout this project and C. Tracy for providing us with his Mathematica program for generating the TW distribution. We also thank O. Bohigas, J.-P. Bouchaud, B. Derrida, E. Marinari, M. Mezard, R. Monasson and G. Parisi for their critical comments. This work was supported in part by the European Community's Human Potential Programme (contracts HPRN-CT-2002-00307 for DYGLAGEMEM and HPRN-CT-2002-00319 for STIPCO).



References

1. I. M. Lifshitz, S. A. Gredeskul, and L. A. Pastur, Introduction to the theory of disordered systems (Wiley, New York, 1988).

2. B. Derrida, Phys. Rev. Lett. 45, 79 (1980).

3. B. Derrida, J. Phys. Lett. (France) 46, 401 (1985).

4. A. Bovier and I. Kurkova, Markov Processes Rel. Fields 9, 209 (2003).

5. D. Sherrington and S. Kirkpatrick, Phys. Rev. Lett. 35, 1792 (1975).

6. M. Talagrand, Probability Theory and Related Fields 117, 303 (2000).

7. W. Feller, An introduction to probability theory and its applications (John Wiley and Sons, New York, NY, 1950).

8. J. Bucklew, Large deviations techniques in decision, simulation, and estimation (John Wiley and Sons, New York, NY, 1990).

9. E. J. Gumbel, Statistics of Extremes (Columbia University Press, NY, 1958).

10. B. W. Southern and A. P. Young, J. Phys. C 10, 2179 (1977).

11. A. N. Berker and S. Ostlund, J. Phys. C: Solid State Phys. 12, 4961 (1979).

12. A. J. Bray and M. A. Moore, in Heidelberg Colloquium on Glassy Dynamics, Vol. 275 of Lecture Notes in Physics, edited by J. L. van Hemmen and I. Morgenstern (Springer, Berlin, 1986), pp. 121-153.

13. D. S. Fisher and D. A. Huse, Phys. Rev. Lett. 56, 1601 (1986).

14. J. Houdayer and O. C. Martin, Phys. Rev. Lett. 83, 1030 (1999), cond-mat/9901276.

15. J.-P. Bouchaud, F. Krzakala, and O. C. Martin, Phys. Rev. B 68, 224404 (2003), cond-mat/0212070.

16. M. Palassini, (2003), lecture at Les Houches.

17. C. Tracy and H. Widom, Statistical Physics on the Eve of the 21st Century 230 (1999).

18. J. Kosterlitz, D. Thouless, and R. Jones, Phys. Rev. Lett. 36, 1217 (1976).

19. M. Palassini, Ph.D. thesis, Scuola Normale Superiore, Pisa, Italy 2000.

20. S. F. Edwards and P. W. Anderson, J. Phys. F: Met. Phys. 5, 965 (1975).

21. C. De Dominicis and Y. Goldschmidt, J. Phys. A 22, L775 (1989).

22. J. Wehr and M. Aizenman, J. Stat. Phys. 60, 287 (1990).

23. E. Brunet and B. Derrida, Phys. Rev. E 61, 6789 (2000).



